Method and apparatus for obtaining auditory and gestural feedback in a recommendation system



Field of the Invention 

The present invention relates to recommendation systems, such as
recommenders for television programming or other content, and more particularly, to a
method and apparatus for updating one or more profiles in such a recommendation system
based on auditory or gestural feedback obtained from the user.

Background of the Invention 

The number of media options available to individuals is increasing at an 
exponential pace. As the number of channels available to television viewers has increased,
for example, along with the diversity of the programming content available on such channels,
it has become increasingly challenging for television viewers to identify television programs
of interest. Historically, television viewers identified television programs of interest by
analyzing printed television program guides. Typically, such printed television program
guides contained grids listing the available television programs by time and date, channel and
title. As the number of television programs has increased, it has become increasingly
difficult to effectively identify desirable television programs using such printed guides.

More recently, television program guides have become available in an 
electronic format, often referred to as electronic program guides (EPGs). Like printed 
television program guides, EPGs contain grids listing the available television programs by
time and date, channel and title. Some EPGs, however, allow television viewers to sort or
search the available television programs in accordance with personalized preferences. In 
addition, EPGs allow for on-screen presentation of the available television programs. 

While EPGs allow viewers to identify desirable programs more efficiently 
than conventional printed guides, they suffer from a number of limitations, which if
overcome, could further enhance the ability of viewers to identify desirable programs. For
example, many viewers have a particular preference towards, or bias against, certain 
categories of programming, such as action-based programs or sports programming. Thus, the 
viewer preferences can be applied to the EPG to obtain a set of recommended programs that 
may be of interest to a particular viewer.

Thus, a number of tools have been proposed or suggested for recommending
television programming. The Tivo™ system, for example, commercially available from Tivo, 
Inc., of Sunnyvale, California, allows viewers to rate shows using a "Thumbs Up and 
Thumbs Down" feature and thereby indicate programs that the viewer likes and dislikes, 
respectively. In this manner, the Tivo™ system implicitly derives the viewer's preferences 
from previous television programs that the viewer liked or did not like. Thereafter, the TiVo 
receiver matches the recorded viewer preferences with received program data, such as an 
EPG, to make recommendations tailored to each viewer. 

Implicit television program recommenders generate television program 
recommendations based on information derived from the viewing history of the viewer, in a 
non-obtrusive manner. Explicit television program recommenders, on the other hand, 
explicitly question viewers about their preferences for program features, such as title, genre, 
actors, channel and date/time, to derive viewer profiles and generate recommendations. 

While such television program recommenders identify programs that are likely 
of interest to a given viewer, they suffer from a number of limitations, which if overcome, 
could further improve the quality of the generated program recommendations. For example, 
the Tivo™ system obtains an explicit indication from the viewer of whether a given watched 
program was liked or disliked, which is then used to derive the viewing preferences of the 
user. The Tivo™ system depends on the affirmative action of the user to indicate whether a 
given watched program was liked or disliked, using the "Thumbs Up" or "Thumbs Down" 
indicator. 

If the user fails to affirmatively indicate whether a given watched program was 
liked or disliked, the Tivo™ system will assume that the user did not like the watched 
program. Thus, the Tivo™ system may make false assumptions regarding the viewing 
preference information associated with the viewing session. In addition, the Tivo™ system 
typically requires the user to enter the "Thumbs Up" or "Thumbs Down" indicator using the 
remote control or set-top terminal, which may not be readily accessible or convenient.

A need therefore exists for a method and apparatus for obtaining feedback 
from a user that can determine or infer whether a given user liked or disliked certain content 
based on the behavior of the user. A further need exists for a method and apparatus for 
evaluating the reaction of a viewer to presented content in real-time and for deriving whether 
or not the viewer liked or disliked the presented content. Yet another need exists for a 
method and apparatus for a recommendation system that permits the user to indicate the 
strength of the user's preferences. Finally, a need exists for a method and apparatus for
evaluating the reaction of a viewer to presented content that derives the viewing preferences
of the user from audio or video information, or both, rather than requiring a manual entry 
using a specific device. 

Summary of the Invention

Generally, a method and apparatus are disclosed for updating a user profile in 
a recommendation system for a given user based on auditory or gestural feedback 
information provided by the user. One or more audio/visual capture devices are focused on 
the user to detect the auditory or gestural feedback. The detected auditory or gestural
feedback may include, for example, predefined (i) auditory commands, (ii) gestural
commands, (iii) facial expressions, or (iv) a combination of the foregoing, collectively 
referred to as "predefined behavioral feedback." 

Generally, the predefined behavioral feedback provides a score indicating the 
strength of the user's preferences, such as preferences for a given program or program
feature. In addition, the feedback can be explicit, such as predefined auditory or gestural
commands indicating the user's preferences (likes or dislikes), or implicit, such as 
information that may be derived from facial expressions or other behavior suggestive of the 
user's preferences. Once predefined behavioral feedback is identified, the present invention 
updates the corresponding user profile, in an appropriate manner. 

A more complete understanding of the present invention, as well as further
features and advantages of the present invention, will be obtained by reference to the
following detailed description and drawings. 

Brief Description of the Drawings 

FIG. 1 illustrates a television programming recommender in accordance with
the present invention;

FIG. 2 illustrates a sample table from the program database of FIG. 1;

FIG. 3A illustrates a sample table from a Bayesian implementation of the
viewer profile of FIG. 1;

FIG. 3B illustrates a sample table from a viewing history used by a decision
tree (DT) recommender;

FIG. 3C illustrates a sample table from a viewer profile generated by a
decision tree (DT) recommender from the viewing history of FIG. 3B; and

FIG. 4 is a flow chart describing an exemplary auditory and gestural feedback
analysis process embodying principles of the present invention.

Detailed Description 

FIG. 1 illustrates a television programming recommender 100 in accordance 
with the present invention. As shown in FIG. 1, the television programming recommender 
100 evaluates each of the programs in an electronic programming guide (EPG) 130 to 
identify programs of interest to one or more viewer(s) 140. The set of recommended 
programs can be presented to the viewer 140 using a set-top terminal/television 160, for
example, using well known on-screen presentation techniques. While the present invention is 
illustrated herein in the context of television programming recommendations, the present 
invention can be applied to any automatically generated recommendations that are based on 
an evaluation of user behavior, such as a viewing history or a purchase history. 

According to one feature of the present invention, the television programming 
recommender 100 determines whether the viewer likes or dislikes a given program based on 
auditory or gestural feedback from the viewer 140. The auditory or gestural feedback from
the viewer 140 can be (i) explicit, such as predefined auditory or gestural commands 
indicating whether the viewer liked or disliked the program (and, optionally, the extent to 
which the viewer liked or disliked the program); or (ii) implicit, such as information that may 
be derived from facial expressions that typically indicate whether the viewer liked or disliked 
the program. The given program can be a program currently being watched by the viewer 
140 or a program or program feature specified by the television programming recommender
100, for example, in a query or survey.

In this manner, since the user is not constrained to using the remote control
or set-top terminal as an input mechanism, the present invention provides a flexible
mechanism for allowing a user to indicate whether the user liked or disliked the
program. In addition, the television programming recommender 100 can validate whether or 
not a viewer liked or disliked a given watched program through evaluation of behavioral 
conduct of the viewer, and not merely assume that a viewer liked a program because it was 
watched. 

As shown in FIG. 1, the television programming recommender 100 includes 
one or more audio/visual capture devices 150-1 through 150-N (hereinafter, collectively 
referred to as audio/visual capture devices 150) that are focused on the viewer 140. The 
audio/visual capture devices 150 may include, for example, a pan-tilt-zoom (PTZ) camera for
capturing video information or an array of microphones for capturing audio information, or
both. 

The audio or video images (or both) generated by the audio/visual capture 
devices 150 are processed by the television programming recommender 100, in a manner 
discussed below in conjunction with FIG. 4, to identify one or more predefined (i) auditory 
commands, (ii) gestural commands, (iii) facial expressions, or (iv) a combination of the 
foregoing, from the viewer 140 (hereinafter, collectively referred to as "predefined
behavioral feedback").

Once predefined behavioral feedback is identified, the television programming 
recommender 100 updates one or more viewer profiles 300, discussed below in conjunction 
with FIGS. 3A and 3C, in an appropriate manner. The viewer-supplied auditory or gestural 
feedback that is detected can correspond to, for example, a score indicating the strength of 
the viewer's like or dislike of the program or program feature. In addition, the detected 
auditory or gestural feedback is used by the television programming recommender 100 to 
update the corresponding viewer profile(s) 300. 

As shown in FIG. 1, the television programming recommender 100 contains a
program database 200, one or more viewer profiles 300, and an auditory and gestural 
feedback analysis process 400, each discussed further below in conjunction with FIGS. 2 
through 4, respectively. Generally, the program database 200 records information for each 
program that is available in a given time interval. One illustrative viewer profile 300, shown 
in FIG. 3A, is an explicit viewer profile that is typically generated from a viewer survey that
provides a rating for each program feature, for example, on a numerical scale that is mapped 
to various levels of interest between "hates" and "loves," indicating whether or not a given 
viewer watched each program feature. Another exemplary viewer profile 300', shown in 
FIG. 3C, is generated by a decision tree recommender, based on an exemplary viewing 
history 360, shown in FIG. 3B. The present invention permits the survey response 
information, if any, recorded in the viewer profile 300 to be supplemented with the detected 
auditory or gestural feedback information. 

The auditory and gestural feedback analysis process 400 analyzes the audio or 
video images (or both) generated by the audio/visual capture devices 150 to identify 
predefined auditory or gestural feedback. Once predefined auditory or gestural feedback is 
identified, the auditory and gestural feedback analysis process 400 updates the viewer profile 
300 in an appropriate manner.

The television programming recommender 100 may be embodied as any computing
device, such as a personal computer or workstation, that contains a processor 120, such as a 
central processing unit (CPU), and memory 110, such as RAM and/or ROM. In addition, the 
television programming recommender 100 may be embodied as any available television 
program recommender, such as the Tivo™ system, commercially available from Tivo, Inc., 
of Sunnyvale, California, or the television program recommenders described in United States 
Patent Application Serial No. 09/466,406, filed December 17, 1999, entitled "Method and 
Apparatus for Recommending Television Programming Using Decision Trees," (Attorney 
Docket No. 700772), United States Patent Application Serial No. 09/498,271, filed Feb. 4, 
2000, entitled "Bayesian TV Show Recommender," (Attorney Docket No. 700690) and
United States Patent Application Serial No. 09/627,139, filed July 27, 2000, entitled "Three- 
Way Media Recommendation Method and System," (Attorney Docket No. 700913), or any 
combination thereof, as modified herein to carry out the features and functions of the present 
invention. 

FIG. 2 is a sample table from the program database 200 of FIG. 1 that records
information for each program that is available in a given time interval. As shown in FIG. 2, 
the program database 200 contains a plurality of records, such as records 205 through 220, 
each associated with a given program. For each program, the program database 200 indicates 
the date/time and channel associated with the program in fields 240 and 245, respectively. In 
addition, the title, genre and actors for each program are identified in fields 250, 255 and 270, 
respectively. Additional well-known features (not shown), such as duration, and description 
of the program, can also be included in the program database 200. 
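
By way of illustration only, one record of the program database 200 might be modeled as in the following Python sketch. The class name, field names, helper function, and sample values are assumptions introduced for this example; only the set of fields (date/time, channel, title, genre, actors) comes from the description of FIG. 2.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class ProgramRecord:
    """One program database record; names are illustrative, not from FIG. 2."""
    start: datetime                                   # date/time (field 240)
    channel: str                                      # channel (field 245)
    title: str                                        # title (field 250)
    genre: str                                        # genre (field 255)
    actors: List[str] = field(default_factory=list)   # actors (field 270)


def programs_in_interval(records, begin, end):
    """Return the records whose start time falls in [begin, end)."""
    return [r for r in records if begin <= r.start < end]
```

A recommender could call `programs_in_interval` to obtain the programs available in a given time interval before scoring them against a viewer profile.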

FIG. 3A is a table illustrating an exemplary explicit viewer profile 300 that
may be utilized by a Bayesian television recommender. As shown in FIG. 3A, the explicit
viewer profile 300 contains a plurality of records 305-313 each associated with a different 
program feature. In addition, for each feature set forth in column 340, the viewer profile 300 
provides a numerical representation in column 350, indicating the relative level of interest of 
the viewer in the corresponding feature. As discussed below, in the illustrative explicit 
viewer profile 300 set forth in FIG. 3A, a numerical scale between 1 ("hate") and 7 ("love") 
is utilized. For example, the explicit viewer profile 300 set forth in FIG. 3A has numerical 
representations indicating that the user particularly enjoys programming on the Sports 
channel, as well as late afternoon programming. 

In an exemplary embodiment, the numerical representation in the explicit 
viewer profile 300 includes an intensity scale such as:

Number    Description
1         Hates
2         Dislikes
3         Moderately negative
4         Neutral
5         Moderately positive
6         Likes
7         Loves



FIG. 3B is a table illustrating an exemplary viewing history 360 that is 
maintained by a decision tree television recommender. As shown in FIG. 3B, the viewing 
history 360 contains a plurality of records 361-369 each associated with a different program. 
In addition, for each program, the viewing history 360 identifies various program features in
fields 370-379. The values set forth in fields 370-379 may typically be obtained from the
electronic program guide 130. It is noted that if the electronic program guide 130 does not
specify a given feature for a given program, the value is specified in the viewing history 360
using a "?".
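
The handling of unspecified features described above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the feature names in `FEATURES` and the dictionary representation of an EPG entry are assumptions, since the actual contents of fields 370-379 are defined by FIG. 3B.

```python
MISSING = "?"  # marker used when the EPG does not specify a feature

# Hypothetical feature names standing in for fields 370-379.
FEATURES = ("date", "time", "channel", "title", "genre")


def history_row(epg_entry, features=FEATURES):
    """Build one viewing-history row, writing '?' for unspecified features."""
    row = {}
    for name in features:
        value = epg_entry.get(name)
        # Treat both an absent key and an explicit None as "not specified".
        row[name] = MISSING if value is None else value
    return row
```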

FIG. 3C is a table illustrating an exemplary viewer profile 300' that may be
generated by a decision tree television recommender from the viewing history 360 set forth in
FIG. 3B. As shown in FIG. 3C, the decision tree viewer profile 300' contains a plurality of
records 381-384 each associated with a different rule specifying viewer preferences. In
addition, for each rule identified in column 390, the viewer profile 300' identifies the
conditions associated with the rule in field 391 and the corresponding recommendation in
field 392.
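
One way to picture the rule-based viewer profile 300' is the Python sketch below, in which each rule pairs a set of conditions with a recommendation. The `Rule` class, the equality-only conditions, the first-match policy, and the sample recommendation labels are all assumptions made for illustration; the referenced decision tree application governs how rules are actually generated and applied.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    conditions: Dict[str, str]   # feature name -> required value (field 391)
    recommendation: str          # recommendation for matching programs (field 392)


def recommend(rules: List[Rule], program: Dict[str, str]) -> Optional[str]:
    """Return the recommendation of the first rule whose conditions all hold."""
    for rule in rules:
        if all(program.get(name) == value for name, value in rule.conditions.items()):
            return rule.recommendation
    return None  # no rule covers this program
```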

For a more detailed discussion of the generating of viewer profiles in a 
decision tree recommendation system, see, for example, United States Patent Application 
Serial No. 09/466,406, filed December 17, 1999, entitled "Method and Apparatus for
Recommending Television Programming Using Decision Trees," (Attorney Docket No.
700772), incorporated by reference above. 

FIG. 4 is a flow chart describing an exemplary auditory and gestural feedback 
analysis process 400. The auditory and gestural feedback analysis process 400 may be 
initiated, for example, during step 410 upon the occurrence of a predefined event, such as the
end of a watched program, the selection of a new channel, or the detection of predefined
auditory or gestural feedback commands.

Thus, a test is performed during step 410 to determine if a predefined event
has occurred to initiate the process 400. In the illustrative implementation of the auditory and 
gestural feedback analysis process 400, the predefined event may be system-initiated, for 
example, corresponding to the end of a watched program or the selection of a new channel, or 
user-initiated, for example, corresponding to the voluntary provision of auditory or gestural 
feedback information. It is further noted that the user-initiated auditory or gestural feedback 
behavior may be affirmative, such as the user indicating to the system 100 that a particular 
program was liked or disliked, or passive, such as the system deriving that a particular 
program was liked or disliked through facial expressions of the user. 

If it is determined during step 410 that a predefined initiation event has not
occurred, then program control returns to step 410 until such a predefined event occurs. If, 
however, it is determined during step 410 that a predefined initiation event has occurred, then 
a further test is performed during step 420 to determine if the detected predefined event 
corresponds to the end of a watched program or selection of a new program. In other words, 
the exemplary test performed during step 420 determines if the predefined event is system- 
initiated or user-initiated. 

If it is determined during step 420 that the detected predefined event 
corresponds to the end of a watched program or selection of a new program (or another 
system-initiated event), then the user is queried for the desired feedback on the program that 
was just watched during step 430. For example, the query may request the user to rate a 
program that was just watched, or a particular program feature associated with the watched 
program. Thereafter, the auditory and gestural feedback analysis process 400 receives the 
user's auditory or gestural feedback response during step 440.

If, however, it is determined during step 420 that the detected predefined event 
does not correspond to the end of a watched program or selection of a new program (or 
another system-initiated event), then the detected predefined event must be a user-initiated 
feedback event. 

The system-initiated auditory or gestural feedback or the user-initiated 
auditory or gestural feedback is processed during step 450 to translate the auditory or gestural 
feedback to a numerical representation indicating the strength of the user's like or dislike of 
the indicated program (or program feature). Thereafter, the viewer profile 300 is updated 
during step 460 with the numerical representation indicating the strength of the user's like or 
dislike, before program control terminates, in a manner discussed further below. 
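
The branching of steps 420 through 460 can be summarized in the following Python sketch, which handles one event that step 410 has already detected. The event names, the dictionary-based event and profile representations, and the `query_user` and `to_score` callables are hypothetical stand-ins for the capture and recognition components; the profile update shown is the simple replacement variant.

```python
from typing import Callable, Dict

# Assumed names for the system-initiated events of step 420.
SYSTEM_EVENTS = {"program_end", "channel_change"}


def handle_event(event: Dict, program: str,
                 query_user: Callable[[str], str],
                 to_score: Callable[[str], int],
                 profile: Dict[str, int]) -> int:
    """Process one detected event and return the score written to the profile."""
    if event["type"] in SYSTEM_EVENTS:
        # Steps 430 and 440: query the user and receive the response.
        raw = query_user(program)
    else:
        # User-initiated event: the feedback arrived with the event itself.
        raw = event["feedback"]
    score = to_score(raw)                      # step 450: translate to 1-7
    if not 1 <= score <= 7:
        raise ValueError("score must lie on the 1-7 scale")
    profile[program] = score                   # step 460: update the profile
    return score
```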

As previously indicated, the auditory or gestural feedback can include (i)
auditory commands, (ii) gestural commands, (iii) facial expressions, or (iv) a combination of 
the foregoing. The auditory commands processed by the auditory and gestural feedback 
analysis process 400 can include, for example, a number of auditory sounds, such as a clap, 
whistle or knocking, each mapped to the illustrative numerical scale between 1 ("hate") and 7 
("love"). In a further variation, the auditory commands can include recognizing the spoken 
words (or corresponding number) corresponding to the illustrative numerical scale between 1 
("hate") and 7 ("love"). 

Likewise, the gestural commands can include a number of gestural acts, such 
as raising a finger, hand or arm to various positions, or adjusting the number of the user's 
fingers in an up or down position to various configurations, each mapped to the illustrative 
numerical scale between 1 ("hate") and 7 ("love"). In a further variation, the gestural 
commands can include recognizing the user pointing to a selection from a list of the 
illustrative numerical scale between 1 ("hate") and 7 ("love") presented on the display 160. 
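
As a minimal sketch of how recognized commands might be mapped to the illustrative scale, the Python functions below accept the output of a speech or gesture recognizer, which is assumed to exist and is not shown. The vocabulary reuses the descriptions of the intensity scale; the convention that N raised fingers means a score of N is an assumption chosen for this example.

```python
from typing import Optional

# Spoken descriptions of the illustrative 1 ("hate") to 7 ("love") scale.
SCALE = {
    "hates": 1, "dislikes": 2, "moderately negative": 3, "neutral": 4,
    "moderately positive": 5, "likes": 6, "loves": 7,
}


def score_from_speech(utterance: str) -> Optional[int]:
    """Map a recognized word or spoken number to the scale, else None."""
    token = utterance.strip().lower()
    if token in SCALE:
        return SCALE[token]
    if token.isdecimal() and 1 <= int(token) <= 7:
        return int(token)
    return None  # not a predefined auditory command


def score_from_fingers(raised: int) -> Optional[int]:
    """Assumed gesture convention: N raised fingers means score N (1-7)."""
    return raised if 1 <= raised <= 7 else None
```

Returning `None` for unrecognized input lets the caller ignore sounds or movements that are not predefined commands rather than recording a spurious score.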

The facial expression of the user can also be processed to derive whether or 
not the viewer liked or disliked a given program. For example, a positive or negative facial 
expression from the user while watching a program typically indicates whether the viewer 
liked or disliked the program. In a further variation, the intensity of the facial expression can 
be determined and varying degrees of facial expression can be mapped to the illustrative 
numerical scale between 1 ("hate") and 7 ("love"). The facial expression may be obtained, 
for example, in accordance with the techniques described in "Facial Analysis from 
Continuous Video with Application to Human-Computer Interface," Ph.D. Dissertation, 
University of Illinois at Urbana-Champaign (1999); or Antonio Colmenarez et al., "A 
Probabilistic Framework for Embedded Face and Facial Expression Recognition," Proc. of
the Int'l Conf. on Computer Vision and Pattern Recognition, Vol. I, 592-97, Fort Collins,
Colorado (1999), each incorporated by reference herein. The intensity of the facial 
expression may be obtained, for example, in accordance with the techniques described in 
United States Patent Application Serial Number 09/705,666, filed November 3, 2000, entitled 
"Estimation of Facial Expression Intensity Using a Bi-Directional Star Topology Hidden 
Markov Model," (Attorney Docket No. 701253), assigned to the assignee of the present 
invention and incorporated by reference herein. 
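
The mapping of facial expression intensity to the illustrative scale could take many forms. The Python sketch below shows one possibility under stated assumptions: an expression recognizer (not shown) is assumed to report a valence of "positive" or "negative" and an intensity between 0 and 1, and the linear mapping around the neutral value 4 is chosen for illustration rather than taken from the referenced techniques.

```python
def score_from_expression(valence: str, intensity: float) -> int:
    """Map an assumed (valence, intensity) reading to the 1-7 scale."""
    if valence not in ("positive", "negative"):
        raise ValueError("valence must be 'positive' or 'negative'")
    level = min(max(intensity, 0.0), 1.0)       # clamp intensity to [0, 1]
    signed = level if valence == "positive" else -level
    # Neutral is 4; full intensity moves the score 3 steps up or down.
    return int(round(4 + 3 * signed))
```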

As previously indicated, the viewer profile 300 or 300' is updated during step 
460 of the auditory and gestural feedback analysis process 400 with the numerical 
representation indicating the strength of the user's like or dislike. More specifically, the
explicit viewer profile 300 of FIG. 3A can be updated, for example, by replacing the previously
recorded value(s) with the newly obtained numerical representation indicating the strength of
the user's like or dislike. Alternatively, the previously recorded value(s) can be replaced
with a moving average over a predefined time window or using an averaging scheme that 
assigns a higher weight to more recent scores. In a decision tree implementation, the viewer 
profile 300' of FIG. 3C can be updated by adding the watched program to the viewing history 
360 and rebuilding the profile 300'. Alternatively, the strength of the user's like or dislike 
can be added directly to the viewer profile 300' by identifying each rule satisfied by the new 
program and adjusting the corresponding rule score in the following manner: 



New Score = Current Score + (1 New Program / Total # Programs Covered by Rule) x [factor illegible in source]
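
The two averaging alternatives mentioned above for the explicit viewer profile 300 can be sketched in Python as follows. Both are illustrative assumptions: the window here counts the most recent scores rather than elapsed time, and exponential smoothing with a hypothetical weight `alpha` is only one of many schemes that assign a higher weight to more recent scores.

```python
from collections import deque


class WindowedScore:
    """Moving average over the most recent `window` scores for one feature."""

    def __init__(self, window: int = 5):
        self._scores = deque(maxlen=window)  # old scores drop out automatically

    def update(self, score: float) -> float:
        self._scores.append(score)
        return sum(self._scores) / len(self._scores)


def recency_weighted(previous: float, new: float, alpha: float = 0.5) -> float:
    """Blend the stored value with a new score; larger alpha favors the new score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * new + (1.0 - alpha) * previous
```

With `alpha` equal to 1 the second function reduces to the plain replacement update, so the replacement and weighted variants can share one code path.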



In an implicit Bayesian recommender system, the implicit viewer profile (not 
shown) can be updated by treating a positive feedback from the user as if the viewer watched 
the program and incrementing the positive feature counts. Likewise, negative feedback from 
the user can be treated as if the viewer had not watched the program and incrementing the
negative feature counts.
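
The count-based update described above might look like the following Python sketch. The `ImplicitProfile` class and the representation of program features as (name, value) pairs are assumptions made for this example, and the sketch covers only the counting step, not the Bayesian scoring that would consume the counts.

```python
from collections import Counter
from typing import Iterable, Tuple

Feature = Tuple[str, str]  # (feature name, feature value), e.g. ("genre", "sports")


class ImplicitProfile:
    """Positive and negative feature counts for an implicit viewer profile."""

    def __init__(self):
        self.positive = Counter()   # features of programs treated as watched
        self.negative = Counter()   # features of programs treated as not watched

    def apply_feedback(self, features: Iterable[Feature], liked: bool) -> None:
        # Positive feedback counts as "watched"; negative as "not watched".
        target = self.positive if liked else self.negative
        target.update(features)
```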

It is to be understood that the embodiments and variations shown and 
described herein are merely illustrative of the principles of this invention and that various
modifications may be implemented by those skilled in the art without departing from the 
scope and spirit of the invention.

1. A method for updating a user profile (300), comprising the steps of:
obtaining said user profile (300) indicating preferences of a user;
analyzing at least one of audio and video information focused on said user to
identify predefined behavioral feedback indicating preferences of said user; and
updating said user profile (300) based on said predefined behavioral feedback.

2. The method of claim 1, wherein said user profile (300) is associated with a
program content recommender (100). 

3. The method of claim 1, wherein said predefined behavioral feedback includes
auditory commands.

4. The method of claim 3, wherein said auditory commands include one of a 
number of auditory sounds each mapped to a numerical scale corresponding to a strength of
said preference of said user.

5. The method of claim 1, wherein said predefined behavioral feedback includes
gestural commands.

6. The method of claim 5, wherein said gestural commands include one of a
number of gestural acts each mapped to a numerical scale corresponding to a strength of said
preference of said user. 

7. The method of claim 5, wherein said gestural commands include pointing to a 
selection from a list of the illustrative numerical scale between 1 ("hate") and 7 ("love")
presented on a display.

8. The method of claim 1, wherein said predefined behavioral feedback includes
deriving said user preferences from a facial expression of said user.

9. The method of claim 1, further comprising the step of requesting feedback
information from said user. 

10. A system (100) for updating a user profile (300), comprising:
a memory (110) for storing computer readable code and said user profile (300); and
a processor (120) operatively coupled to said memory (110), said processor
(120) configured to:
obtain said user profile (300) indicating preferences of a user;
analyze at least one of audio and video information focused on said user to
identify predefined behavioral feedback indicating preferences of said user; and
update said user profile (300) based on said predefined behavioral feedback.

11. A computer readable medium having computer readable code means 
embodied thereon, said computer readable program code means comprising: 

a step to obtain a user profile (300) indicating viewing preferences of a user;
a step to analyze at least one of audio and video information focused on said
user to identify predefined behavioral feedback indicating viewing preferences of said user;
and
a step to update said user profile (300) based on said predefined behavioral
feedback.